Research on proof of work mining dilemma based on policy gradient algorithm

WANG Tiantian, YU Shuangyuan, XU Baomin

Journal of Computer Applications 2019, 39 (5): 1336-1342. DOI: 10.11772/j.issn.1001-9081.2018102197


To address the mining dilemma caused by block withholding attacks under the Proof of Work (PoW) consensus mechanism in blockchains, the game between mining pools was modeled as an Iterated Prisoner's Dilemma (IPD), and the policy gradient algorithm from deep reinforcement learning was used to study strategy choices in the IPD. Each mining pool was treated as an independent agent, and the miners' infiltration rate was quantified as an action distribution in reinforcement learning. The policy network of the policy gradient algorithm was used to predict and optimize each agent's behavior so as to maximize miners' average revenue. The effectiveness of the policy gradient algorithm was validated through simulation experiments. The experimental results show that the mining pools initially attack each other, with miners' average revenue below 1, which is the Nash equilibrium problem of the dilemma. After self-adjustment by the policy gradient algorithm, the relationship between the mining pools shifts from mutual attack to mutual cooperation: the infiltration rate of each pool tends to zero and miners' average revenue tends to 1. These results indicate that the policy gradient algorithm can resolve the Nash equilibrium problem of the mining dilemma and maximize miners' average revenue.
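The abstract does not give the revenue formulas or the network design, so the following is only a minimal sketch under stated assumptions. It uses the two-pool block withholding revenue model commonly found in the mining dilemma literature (total network power normalized to 1), which may differ from the paper's exact formulation. The discretized infiltration rates in `RATES`, the baseline of 1, the learning rate, and the function names are all hypothetical, and a plain softmax over logits stands in for the paper's policy network.

```python
import math

def revenue_densities(m1, m2, x12, x21):
    """Per-miner revenue of two pools (assumed model, total power = 1).

    m1, m2: pool sizes as fractions of total mining power.
    x12: power pool 1 sends to infiltrate pool 2; x21: the reverse.
    A value of 1.0 equals what an honest solo miner would earn.
    """
    # Infiltrating miners withhold blocks, so they add no effective power.
    effective = 1.0 - x12 - x21
    R1 = (m1 - x12) / effective  # direct mining revenue of pool 1
    R2 = (m2 - x21) / effective  # direct mining revenue of pool 2
    # Closed-form solution of the coupled system
    #   r1 * (m1 + x21) = R1 + x12 * r2
    #   r2 * (m2 + x12) = R2 + x21 * r1
    denom = m1 * m2 + m1 * x12 + m2 * x21
    r1 = (m2 * R1 + x12 * (R1 + R2)) / denom
    r2 = (m1 * R2 + x21 * (R1 + R2)) / denom
    return r1, r2

# Hypothetical discretization of the infiltration rate (not from the paper).
RATES = [0.0, 0.025, 0.05]

def softmax(logits):
    """Numerically stable softmax: turns logits into an action distribution."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_step(logits, action, reward, baseline=1.0, lr=0.1):
    """One REINFORCE update for a softmax policy over RATES.

    Uses grad of log pi(action) w.r.t. logit i = 1[i == action] - pi(i).
    The baseline of 1.0 (honest-mining revenue) is an assumption.
    """
    probs = softmax(logits)
    advantage = reward - baseline
    return [
        v + lr * advantage * ((1.0 if i == action else 0.0) - p)
        for i, (v, p) in enumerate(zip(logits, probs))
    ]

if __name__ == "__main__":
    print(revenue_densities(0.3, 0.3, 0.0, 0.0))    # no infiltration
    print(revenue_densities(0.3, 0.3, 0.05, 0.0))   # only pool 1 attacks
    print(revenue_densities(0.3, 0.3, 0.05, 0.05))  # mutual attack
```

Under this assumed model, two pools of size 0.3 each earn exactly 1 per miner with no infiltration. If only pool 1 infiltrates with 0.05, it earns slightly more than 1 while pool 2 earns less than 1. If both infiltrate with 0.05, both earn about 0.93, which is the prisoner's dilemma structure the abstract describes. The sketch makes no claim about whether independent learners built this way reach the cooperative outcome reported in the paper.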
